AITopics | black-box model

Conformal Prediction using Conditional Histograms

Neural Information Processing SystemsMay-1-2026, 02:04:56 GMT

This paper develops a conformal method to compute prediction intervals for nonparametric regression that can automatically adapt to skewed data. Leveraging black-box machine learning algorithms to estimate the conditional distribution of the outcome using histograms, it translates their output into the shortest prediction intervals with approximate conditional coverage. The resulting prediction intervals provably have marginal coverage in finite samples, while asymptotically achieving conditional coverage and optimal length if the black-box model is consistent. Numerical experiments with simulated and real data demonstrate improved performance compared to state-of-the-art alternatives, including conformalized quantile regression and other distributional conformal prediction approaches.

artificial intelligence, machine learning, prediction interval, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

22b111819c74453837899689166c4cf9-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 01:54:25 GMT

artificial intelligence, explanation method, machine learning, (13 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

1387a00f03b4b423e63127b08c261bdc-Paper.pdf

Neural Information Processing SystemsApr-24-2026, 19:13:05 GMT

data mining, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
(3 more...)

Add feedback

Concept frustration: Aligning human concepts and machine representations

Parisini, Enrico, Soelistyo, Christopher J., Isaac, Ahab, Barp, Alessandro, Banerji, Christopher R. S.

arXiv.org Machine LearningApr-1-2026

Aligning human-interpretable concepts with the internal representations learned by modern machine learning systems remains a central challenge for interpretable AI. We introduce a geometric framework for comparing supervised human concepts with unsupervised intermediate representations extracted from foundation model embeddings. Motivated by the role of conceptual leaps in scientific discovery, we formalise the notion of concept frustration: a contradiction that arises when an unobserved concept induces relationships between known concepts that cannot be made consistent within an existing ontology. We develop task-aligned similarity measures that detect concept frustration between supervised concept-based models and unsupervised representations derived from foundation models, and show that the phenomenon is detectable in task-aligned geometry while conventional Euclidean comparisons fail. Under a linear-Gaussian generative model we derive a closed-form expression for Bayes-optimal concept-based classifier accuracy, decomposing predictive signal into known-known, known-unknown and unknown-unknown contributions and identifying analytically where frustration affects performance. Experiments on synthetic data and real language and vision tasks demonstrate that frustration can be detected in foundation model representations and that incorporating a frustrating concept into an interpretable model reorganises the geometry of learned concept representations, to better align human and machine reasoning. These results suggest a principled framework for diagnosing incomplete concept ontologies and aligning human and machine conceptual reasoning, with implications for the development and validation of safe interpretable AI for high-risk applications.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2603.29654

Country:

Europe > United Kingdom (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (0.67)
Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.54)
(2 more...)

Add feedback

DPIC: Decoupling Prompt and Intrinsic Characteristics for LLM Generated Text Detection

Neural Information Processing SystemsMar-18-2026, 18:39:41 GMT

Large language models (LLMs) have the potential to generate texts that pose risks of misuse, such as plagiarism, planting fake reviews on e-commerce platforms, or creating inflammatory false tweets. Consequently, detecting whether a text is generated by LLMs has become increasingly important. Existing high-quality detection methods usually require access to the interior of the model to extract the intrinsic characteristics. However, since we do not have access to the interior of the black-box model, we must resort to surrogate models, which impacts detection quality. In order to achieve high-quality detection of black-box models, we would like to extract deep intrinsic characteristics of the black-box model generated texts.

artificial intelligence, large language model, natural language, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Defending against Data-Free Model Extraction by Distributionally Robust Defensive Training

Neural Information Processing SystemsMar-17-2026, 21:03:00 GMT

Data-Free Model Extraction (DFME) aims to clone a black-box model without knowing its original training data distribution, making it much easier for attackers to steal commercial models. Defense against DFME faces several challenges: (i) effectiveness; (ii) efficiency; (iii) no prior on the attacker's query data distribution and strategy. However, existing defense methods: (1) are highly computation and memory inefficient; or (2) need strong assumptions about attack data distribution; or (3) can only delay the attack or prove a model theft after the model stealing has happened. In this work, we propose a Memory and Computation efficient defense approach, named MeCo, to prevent DFME from happening while maintaining the model utility simultaneously by distributionally robust defensive training on the target victim model. Specifically, we randomize the input so that it: (1) causes a mismatch of the knowledge distillation loss for attackers; (2) disturbs the zeroth-order gradient estimation; (3) changes the label prediction for the attack query data. Therefore, the attacker can only extract misleading information from the black-box model. Extensive experiments on defending against both decision-based and score-based DFME demonstrate that MeCo can significantly reduce the effectiveness of existing DFME methods and substantially improve running efficiency.

artificial intelligence, machine learning, proceedings, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback